Estimation of the Duplication History under a Stochastic Model for Tandem Repeats
Abstract
We present a stochastic model for tandem duplication and substitution mutations that can be used to estimate relative mutation rates and the total number of mutations from a single sequence. Important parameters of the model include the probability of a substitution mutation and the probabilities of tandem duplications of various lengths. Our model indicates that if the probability of substitution mutations is insignificant, little information can be obtained from a single sequence (which has undergone only tandem duplication). On the other hand, in the presence of both substitution and tandem duplication, one can estimate the parameters of the model, using which we can estimate the total number of mutations of each type. We validate our estimation method via Monte Carlo simulation and show that it outperforms the state of the art algorithm. We also apply our method to the tandem repeat regions in the human genome, where it demonstrates the different behavior of micro- and minisatellites and can be used to compare mutation rates across chromosomes.
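The kind of process the abstract describes can be illustrated with a small forward simulation. This is only a sketch under stated assumptions, not the authors' model or estimator: the function name `simulate_tandem_history`, the uniform choice of mutation position, the uniform choice of replacement symbol, and the rule that an oversized duplication is skipped are all hypothetical choices made for illustration. The parameters `p_sub` and `dup_probs` stand in for the substitution probability and the per-length duplication probabilities mentioned above.

```python
import random


def simulate_tandem_history(seed, n_steps, p_sub, dup_probs, alphabet="ACGT", rng=None):
    """Apply n_steps random mutations to a non-empty seed sequence.

    p_sub     -- probability that a step is a substitution (assumed name)
    dup_probs -- dict: duplication length k -> probability of that duplication
    The probabilities p_sub and dup_probs.values() must sum to 1.
    """
    rng = rng or random.Random()
    if abs(p_sub + sum(dup_probs.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")

    seq = list(seed)
    counts = {"sub": 0, "dup": {k: 0 for k in dup_probs}}
    kinds = ["sub"] + list(dup_probs)
    weights = [p_sub] + [dup_probs[k] for k in dup_probs]

    for _ in range(n_steps):
        kind = rng.choices(kinds, weights=weights)[0]
        if kind == "sub":
            # Substitution: replace one uniformly chosen position
            # with a different symbol (assumption: uniform over the rest).
            i = rng.randrange(len(seq))
            seq[i] = rng.choice([c for c in alphabet if c != seq[i]])
            counts["sub"] += 1
        elif kind <= len(seq):
            # Tandem duplication of length `kind`: copy seq[i:i+kind]
            # and insert the copy immediately after the original.
            i = rng.randrange(len(seq) - kind + 1)
            seq[i + kind:i + kind] = seq[i:i + kind]
            counts["dup"][kind] += 1
        # Assumption: a duplication longer than the sequence is skipped.

    return "".join(seq), counts


if __name__ == "__main__":
    final_seq, mutation_counts = simulate_tandem_history(
        "ACGT", 50, p_sub=0.4, dup_probs={1: 0.3, 2: 0.2, 3: 0.1},
        rng=random.Random(0),
    )
    print(final_seq)
    print(mutation_counts)
```

Setting `p_sub` to 0 in such a simulation produces sequences shaped by duplication alone, the case in which, according to the abstract, a single sequence carries little information.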
Similar resources
Neighbor Joining Approaches for Reconstructing Tandem Duplication History
Motivation: Genomes are replete with short sequences repeated consecutively called tandem repeats. Reconstructing duplication histories for tandem repeats may yield valuable insights into their functions and the biological mechanisms of tandem repeat creation and extension. Results: we design and implement a set of heuristic algorithms for reconstructing tandem duplication history with neighbor...
Genaralized Neighbor Joining Approaches for Reconstructing Tandem Duplication History: a comparitive study
Motivation: Genomes are replete with short sequences repeated consecutively called tandem repeats. Reconstructing duplication histories for tandem repeats may yield valuable insights into their functions and the biological mechanisms of tandem repeat creation and extension. Results: We study the generalized neighbor-joining approaches for reconstructing tandem duplication history. We develop a ...
Estimating Mutation Rates and Sequence Age under a Stochastic Model for Tandem Duplication and Point Mutation
We present a stochastic model for tandem duplications and point mutations that can be used to estimate relative mutation rates and the total number of mutations from a single sequence. Important parameters of the model include the probability of a point mutation and the probability of a tandem duplication of a given length. Our model indicates that if the probability of point mutations is insig...
Reconstructing the Duplication History of a Tandem Repeat
One of the less well understood mutational transformations that act upon DNA is tandem duplication. In this process, a stretch of DNA is duplicated to produce two or more adjacent copies, resulting in a tandem repeat. Over time, the copies undergo additional mutations so that typically, multiple approximate tandem copies are present. An interesting feature of tandem repeats is that the duplicat...
Gene Family: Structure, Organization and Evolution
Gene families are groups of homologous genes that share very similar sequences and may have identical functions. Members of gene families may be found in tandem repeats or interspersed throughout the genome. These sequences are copies of ancestral genes that have undergone changes. The multiple copies of each gene in a family were constructed based on gene duplicati...
Publication date: 2017